CSE 444 Final
-- Solutions
December 8, 2000
2:30pm – 4:20pm
Total number of points:
100
Total time: 1h 50’
-
-
We add two arrows:
The representation is not faithful.
- We
define the following tables in SQL: Employee, Manager, Purchase, Buys.
CREATE TABLE
Employee (
id CHAR(10) PRIMARY KEY,
name VARCHAR(30),
phone CHAR(10),
ManagedBy CHAR(10)
REFERENCES Manager(id)
);
CREATE TABLE Manager (
id CHAR(10) PRIMARY KEY
REFERENCES Empolyee(id),
level INT
);
CREATE TABLE Purchase (
purchaseNo CHAR(20) PRIMARY
KEY,
date DATE,
amount INT
);
/* Buys could be integrated into Purchase */
CREATE TABLE Buys (
purchase CHAR(20) PRIMARY
KEY REFERENCES Purchase(purchaseNo),
beneficiary CHAR(10)
REFERENCES Employee(id),
approves CHAR(10)
REFERENCES Manager(id)
);
Note: the above script will not run on SQL Server because of the cyclic
references between Employee and Manager. Instead we need to declare the
foreign key ManagedBy by altering the table Employee. Also, level is a
keyword. The following script runs on SQL Server.
CREATE
TABLE Employee (
id CHAR(10) PRIMARY KEY,
name VARCHAR(30),
phone CHAR(10),
ManagedBy CHAR(10)
);
CREATE TABLE Manager (
id CHAR(10) PRIMARY KEY
REFERENCES Empolyee(id),
thelevel INT
);
ALTER TABLE Employee ADD FOREIGN KEY (ManagedBy) REFERENCES Manager(id);
CREATE TABLE Purchase (
purchaseNo CHAR(20) PRIMARY
KEY,
date DATETIME
);
CREATE TABLE Buys (
purchase CHAR(20) PRIMARY
KEY REFERENCES Purchase(purchaseNo),
beneficiary CHAR(10)
REFERENCES Employee(id),
approves CHAR(10)
REFERENCES Manager(id)
);
-
i.
The Query is:
SELECT m.id, em.name,
sum(p.amount)
FROM Manager m, Employee em, Employee
e, Buys b, Purchase p
WHERE m.id=em.id
AND e.name = “Smith”
AND b.purchase = p.purchaseNo AND b.beneficiary=e.id
ANDb.approves=m.id
GROUP BY m.id, em.name
ii.
The query is:
SELECT DISTINCT p.purchaseNo, p.amount, e.name
FROM Purchase p, Buys b, Manager m,
Empolyee e
WHERE p.purchase = b.purchase AND m.id = b.approves
AND e.id = .beneficiary AND m.id = e.id
- We
run the following SQL queries in sequence:
INSERT INTO Manager(id, level)
SELECT DISTINCT managedBy AS
id, 1 AS level
FROM Empolyee
WHERE managedBy IS NOT NULL;
UPDATE Manager
SET level = 2
WHERE id IN (SELECT m2.id
FROM
Empolyee e, Manager m1, Manager m2
WHERE e.id = m1.id AND e.managedBy =
m2.id)
UPDATE Manager
SET level = 3
WHERE id IN (SELECT m3.id
FROM
Empolyee e, Manager m2, Manager m3
WHERE
e.id = m2.id AND e.managedBy = m3.id
AND m2.level=2)
UPDATE Manager
SET level = 4
WHERE id IN (SELECT m4.id
FROM
Empolyee e, Manager m3, Manager m4
WHERE
e.id = m3.id AND e.managedBy = m4.id
AND m3.level=3)
- The
answers are:
- Yes,
no, no:
i.
Yes: if some attribute A is in X+, it means that X ý A.
But then we have Y ý A, hence A is in Y+.
ii.
No: Consider a relation R(A,B,C) with one fd: AB ý C.
Take X=A, Y=B. Then:
X+ = A, Y+ = B,
X+ Ý Y+
= AB, and (X Ý Y)
+=(AB)+ = ABC
iii.
No: Consider a relation R(A,B,C,D) with three fds: AB ý D,
AC ý
D. Take X=AB, Y=AC. Then X+ Ü Y+ = (ABD) Ü (ACD) = AD while (X Ü Y)
+ = A+ = A.
- We
compute X+ for every subset X of ABCD:
A+ = A (e.g. to
see why B is not in A+, consider row 2 (A=1,B=1) and 4 (A=1,B=2))
B+=B
C+=C
D+=BCD
AB+=ABC (to see why D is not in AB+ consider the first two rows)
AC+=AC (to see why B is not in
AC+ consider rows 1 and 4)
AD+=ABCD this is a minimal key
ABC+=ABC (to see why D is not in ABC+ consider the first two rows)
ABD+=ABCD (follows immediately from AD being a key)
ACD+=ABCD (follows immediately from AD being a key)
BCD+=BCD (to see why A is not in BCD+ consider rows 1 and 3)
There is only one minimal key: AD.
A basis for all functional dependencies is: D ý
BC, AB ý
C
- The
answers:
- One
record in Product takes 100 bytes and we can put 10 in a block, hence
B(Product)=T(Product)/10=100,000
One record in Sales takes 50 bytes and we can put 20 in a block, hence
B(Sales)=T(Sales)/20=100,000.
- The
cost of the plan is given by the cost of the join:
B(Product)+B(Product)B(Sales)/(M-1) = 100,000 +
100,000*100,000/100=100,100,000=(approx) 100M.
- Let
Product’ and Sales’ be the left and right operand of the join respectively.
We have:
T(Product’) = 10,000, record size = 20, records per block = 50
B(Product’) = 10,000/50 = 200
T(Sales’) = 40,000, record size = 16, records per block= 1000/16
B(Sales’) = 40,000/(1000/16) = 640 (approx)
It is more advantageous to use Product’ as outer relation. That is, the
bock-nested loop will read data from Product, perform the projection and
selection on the fly and store 100 blocks of records from Product’ in
main memory. For each such set it will do a full scan of Sales, perform the
selection and projection, then the join. Total cost:
B(Product) + B(Product’)/100 * B(Sales) = 100,000 + 200/100*100,000=100,000 +
200,000=300,000
- The
answer is:
- Scanning
the log from the end to the beginning we find:
Uncommitted transactions are: T6, T4, T3. These need to be run again by
the application
Committed transactions are: T5, T1, T2
For recovery we scan the log from the bottom up to <start
ckpt(T1,T2)> and undo the updates of transactions T6, T4, T3:
X = 1
Y = 1, 12
Z = 3, 5
U = 4, 9, 8
V = 5
- Scanning
the log from the end to the beginning we find:
Uncommitted transactions: T6, T5, T3. These need to be run again by the
application.
Committed transactions: T4, T1, T2
For recovery: last successful checkpoint was <start ckpt(T1,T3)>.
Need to start at the earliest between <start T1> and <start
T3>: hence from the beginning of the log. Redo the updates of
transactions T4, T1, T2:
X = 1, 4
Y = 2
Z = 3
U = 4, 8, 9
V = 5